Amendments to the Claims
This listing of claims will replace all prior versions of claims in the application: 
Listing of Claims: 
Claims 1-68 (Canceled) 

69. (New) A category visualization (CV) system that displays a graphic representation of each 
category as a hierarchical map, comprising: 

a node corresponding to each base category; 

nodes corresponding to combinations of similar categories; 

a leaf node corresponding to a base category, the leaf node is positioned as a cluster of nodes 
at a lowest level of the hierarchy wherein combinations of similar categories are positioned on top of 
the leaf node, forming successively higher levels of the hierarchy; 

a root node corresponding to a category that contains all records in a collection, wherein the
root node forms the top of the hierarchy;

a non-leaf node corresponding to each combined category, wherein similar base categories 
are combined into a combined category; and 

wherein each non-leaf node has two arcs that connect the non-leaf node to two nodes 
corresponding to sub-categories of the combined category. 

70. (New) The system of claim 69, wherein the base category is a category identified by a 
categorization process (classification and clustering). 

71. (New) The system of claim 69, wherein the combined category is assigned the records of two
or more base categories. 

72. (New) The system of claim 69, wherein if a node is selected, the system displays additional
information about the corresponding category, such as the number of records in the category or
characteristic attributes of the category.

73. (New) The system of claim 72, wherein the additional information further comprises
characteristic and discriminating information such as attribute-value discrimination, wherein
attribute-value discrimination refers to how well the value of an attribute distinguishes the
records of one category from the records of another category.

74. (New) The system of claim 73, wherein attribute-value discrimination is determined by
the equation:

$$\mathrm{discrim}(x_i \mid G_1, G_2) = \bigl(p(x_i \mid G_1) - p(x_i \mid G_2)\bigr)\log\frac{p(x_i \mid G_1)}{p(x_i \mid G_2)} + \bigl(p(x_i \mid G_2) - p(x_i \mid G_1)\bigr)\log\frac{1 - p(x_i \mid G_1)}{1 - p(x_i \mid G_2)}$$

where $\mathrm{discrim}(x_i \mid G_1, G_2)$ is the measurement of how well the value of an attribute
distinguishes the records of one combined category from the records of another combined category,
$G_1$ is the first combined category,
$G_2$ is the second combined category,
$x_i$ is the records in one of the combined categories,
$p(x_i \mid G_1)$ is the probability that a record containing specific attributes is in combined
category $G_1$, and
$p(x_i \mid G_2)$ is the probability that a record containing specific attributes is in combined
category $G_2$.
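
As an illustration only, the discrimination measure can be sketched in Python. The sketch assumes the measure is the two-term log-ratio form, in which the first term compares p(x_i|G_1) with p(x_i|G_2) and the second compares their complements; the function name `discrim` and the arguments `p1` and `p2` are hypothetical and not part of the claims.

```python
import math


def discrim(p1: float, p2: float) -> float:
    """Sketch of attribute-value discrimination between two categories.

    p1 stands for p(x_i | G_1) and p2 for p(x_i | G_2). Both must lie
    strictly between 0 and 1 so that every logarithm is defined.
    """
    if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
        raise ValueError("probabilities must be strictly between 0 and 1")
    # First term: difference weighted by the log ratio of the probabilities.
    first = (p1 - p2) * math.log(p1 / p2)
    # Second term: reversed difference weighted by the log ratio of the complements.
    second = (p2 - p1) * math.log((1.0 - p1) / (1.0 - p2))
    return first + second
```

Under this assumed form the score is zero when the two categories assign the attribute value the same probability, and it grows as the probabilities move apart.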

75. (New) The system of claim 69, wherein if an arc is selected, the system displays information 
relating to categories connected by the arc, such as similarity value for the connected categories. 

76. (New) The system of claim 75, wherein similarity value refers to a rating of the differences
between attribute values of records in one category and attribute values of records in another
category, wherein a high value for similarity indicates that there is little difference between the
records in the two categories.

77. (New) The system of claim 76, wherein the similarity value for a pair of base categories
is determined by the equation:

$$\mathrm{dist}(h_1, h_2) = \sum_{x_1,\ldots,x_m} \bigl(p(x_1,\ldots,x_m \mid h_1) - p(x_1,\ldots,x_m \mid h_2)\bigr)\log\frac{p(x_1,\ldots,x_m \mid h_1)}{p(x_1,\ldots,x_m \mid h_2)}$$

where $\mathrm{dist}(h_1, h_2)$ is the distance and similarity between two categories,
$x_1, \ldots, x_m$ are the attribute values,
$h_1, h_2$ is a count of a total number of records in categories 1 and 2,
$p(x_1,\ldots,x_m \mid h_1)$ is a conditional probability that a record has attribute values
$x_1, \ldots, x_m$ given that it is a record from category 1, and
$p(x_1,\ldots,x_m \mid h_2)$ is a conditional probability that a record has attribute values
$x_1, \ldots, x_m$ given that it is a record from category 2.
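
A minimal Python sketch of this distance follows, assuming it is a sum over every joint setting of the attribute values of the probability difference times the log probability ratio. The name `dist_joint` and the dictionary representation of the conditional distributions are hypothetical choices made for illustration.

```python
import math
from typing import Dict, Tuple

# Maps a full setting of attribute values (x_1, ..., x_m) to its probability.
Joint = Dict[Tuple[str, ...], float]


def dist_joint(p_h1: Joint, p_h2: Joint) -> float:
    """Sketch of the distance between two base categories h_1 and h_2.

    p_h1[x] stands for p(x_1, ..., x_m | h_1), p_h2[x] for the same under h_2.
    Both tables must cover the same settings with strictly positive values.
    """
    if p_h1.keys() != p_h2.keys():
        raise ValueError("both categories must be defined over the same settings")
    total = 0.0
    for setting, p1 in p_h1.items():
        p2 = p_h2[setting]
        # Each term is non-negative: the difference and the log share a sign.
        total += (p1 - p2) * math.log(p1 / p2)
    return total
```

The number of joint settings grows exponentially with the number of attributes, so a table like this is only practical for a small number of attributes.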

78. (New) The system of claim 76, wherein the similarity for a pair of base categories is
determined by the equation:

$$\mathrm{dist}(h_1, h_2) = \sum_{i}\sum_{x_i} \bigl(p(x_i \mid h_1) - p(x_i \mid h_2)\bigr)\log\frac{p(x_i \mid h_1)}{p(x_i \mid h_2)}$$

where $\mathrm{dist}(h_1, h_2)$ is the distance and similarity between two categories,
$x_i$ is the attribute values,
$h_1, h_2$ is a count of a total number of records in categories 1 and 2,
$p(x_i \mid h_1)$ is a conditional probability that a record has attribute values $x_i$ given that
it is a record from category 1, and
$p(x_i \mid h_2)$ is a conditional probability that a record has attribute values $x_i$ given that
it is a record from category 2.
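
The per-attribute form can be sketched the same way, assuming the distance is a double sum over attributes i and over the values x_i of each attribute. The name `dist_per_attribute` and the nested-dictionary layout are hypothetical.

```python
import math
from typing import Dict

# Maps an attribute name to a table of {value: probability} for that attribute.
Marginals = Dict[str, Dict[str, float]]


def dist_per_attribute(p_h1: Marginals, p_h2: Marginals) -> float:
    """Sketch of the distance between base categories, summed attribute by attribute.

    p_h1[i][v] stands for p(x_i = v | h_1), p_h2[i][v] for the same under h_2.
    All probabilities must be strictly positive.
    """
    total = 0.0
    for attribute, table1 in p_h1.items():
        table2 = p_h2[attribute]
        for value, p1 in table1.items():
            p2 = table2[value]
            total += (p1 - p2) * math.log(p1 / p2)
    return total
```

This version needs only one small table per attribute, so its cost grows linearly with the number of attributes.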

79. (New) The system of claim 76, wherein the similarity for two combined categories is
determined by the equation:

$$\mathrm{dist}(G_1, G_2) = \sum_{h_j \in G_1,\, h_k \in G_2} p(h_j)\,p(h_k)\,\mathrm{dist}(h_j, h_k)$$

where $\mathrm{dist}(G_1, G_2)$ is the distance and similarity between two combined categories,
$G_1$ is the first combined category,
$G_2$ is the second combined category,
$h_j, h_k$ is a count of a total number of records in combined categories 1 and 2, and
$p(h_j)\,p(h_k)$ is a probability that a record is in each of the combined categories.
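
A short Python sketch of this weighted distance is given below. It assumes the sum runs over every pair of base categories h_j in G_1 and h_k in G_2, with each pairwise distance weighted by p(h_j)p(h_k). The names `dist_weighted`, `prior`, and `base_dist` are hypothetical.

```python
from typing import Callable, Dict, Iterable


def dist_weighted(
    g1: Iterable[str],
    g2: Iterable[str],
    prior: Dict[str, float],
    base_dist: Callable[[str, str], float],
) -> float:
    """Sketch of the probability-weighted distance between combined categories.

    g1 and g2 list the base categories in each combined category,
    prior[h] stands for p(h), and base_dist(h_j, h_k) is any pairwise
    distance between base categories supplied by the caller.
    """
    members2 = list(g2)  # materialize once so g2 can be any iterable
    return sum(
        prior[hj] * prior[hk] * base_dist(hj, hk)
        for hj in g1
        for hk in members2
    )
```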

80. (New) The system of claim 76, wherein the similarity for two combined categories is
determined by the equation:

$$\mathrm{dist}(G_1, G_2) = \min\{\mathrm{dist}(h_j, h_k) \mid h_j \in G_1,\ h_k \in G_2\}$$

where $\mathrm{dist}(G_1, G_2)$ is the minimum distance between two combined categories,
$G_1$ is the first combined category,
$G_2$ is the second combined category, and
$h_j, h_k$ is a count of a total number of records in combined categories 1 and 2.

81. (New) The system of claim 76, wherein the similarity for two combined categories is
determined by the equation:

$$\mathrm{dist}(G_1, G_2) = \max\{\mathrm{dist}(h_j, h_k) \mid h_j \in G_1,\ h_k \in G_2\}$$

where $\mathrm{dist}(G_1, G_2)$ is the maximum distance between two combined categories,
$G_1$ is the first combined category,
$G_2$ is the second combined category, and
$h_j, h_k$ is a count of a total number of records in combined categories 1 and 2.
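
The minimum and maximum forms of the distance between combined categories can be sketched together in Python. Both take the extreme pairwise distance over all base categories h_j in G_1 and h_k in G_2; the names `dist_min`, `dist_max`, and `base_dist` are hypothetical.

```python
from typing import Callable, Iterable


def dist_min(g1: Iterable[str], g2: Iterable[str],
             base_dist: Callable[[str, str], float]) -> float:
    """Smallest pairwise distance between members of two combined categories."""
    members2 = list(g2)
    return min(base_dist(hj, hk) for hj in g1 for hk in members2)


def dist_max(g1: Iterable[str], g2: Iterable[str],
             base_dist: Callable[[str, str], float]) -> float:
    """Largest pairwise distance between members of two combined categories."""
    members2 = list(g2)
    return max(base_dist(hj, hk) for hj in g1 for hk in members2)
```

Both functions expect non-empty combined categories, since `min` and `max` raise `ValueError` on an empty sequence.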

82. (New) The system of claim 69, wherein the graphic representation of each category is
displayed as a decision tree, further comprising:

nodes that correspond to each attribute of the corresponding base categories; and 
arcs that correspond to values of that attribute; 

wherein each node, except the root node, represents a setting of attribute values as indicated 
by arcs in a path from a first node to the root node. 

83. (New) The system of claim 82, wherein selection of a node results in display of a
probability for each category that a record in the category will have attribute settings that are
represented by the path.

84. (New) A CV system that displays a graphic representation of each category as a similarity 
graph, comprising: 

a node corresponding to each category; and 
an arc that connects similar nodes; 

wherein a similarity threshold is selected and arcs are displayed between nodes
corresponding to pairs of categories whose similarity is above the similarity threshold; and

wherein arcs between nodes are removed and added based upon changes to the similarity 
threshold. 
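
The threshold behavior can be illustrated with a small Python sketch. It assumes similarity values are stored per pair of categories and that an arc is shown only when the pair's similarity is strictly above the threshold; the names `visible_arcs` and `arc_changes` are hypothetical.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]


def visible_arcs(similarity: Dict[Pair, float], threshold: float) -> Set[Pair]:
    """Return the pairs of categories whose similarity exceeds the threshold."""
    return {pair for pair, value in similarity.items() if value > threshold}


def arc_changes(similarity: Dict[Pair, float], old: float, new: float):
    """Return (added, removed) arcs when the threshold moves from old to new."""
    before = visible_arcs(similarity, old)
    after = visible_arcs(similarity, new)
    return after - before, before - after
```

Raising the threshold can only remove arcs and lowering it can only add them, so a display needs to redraw just the pairs returned by `arc_changes`.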

85. (New) The system of claim 84, wherein similar categories are combined. 

86. (New) The system of claim 84, wherein a category is split into sub-categories. 

87. (New) A method of calculating and displaying a graphic representation of various 
characteristics and discriminating information for each category, comprising: 

providing nodes that represent each base category; 

providing nodes that represent combined categories, wherein combinations of similar 
categories are grouped together to form the combined categories; 

utilizing a leaf node to form the bottom of the graphic representation;
utilizing a root node to form the top of the graphic representation;
connecting nodes representing sub-categories of a combined category via arcs;
combining the two base categories that are the most similar into a combined category; and
repeating the process of combining similar categories until one combined category represents all
records in a collection.
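
The combining steps above can be sketched as a simple agglomerative loop in Python. This is only an illustration under stated assumptions: the function name `build_hierarchy` is hypothetical, and the caller supplies `group_dist`, a distance between two groups of base categories where a smaller distance is taken to mean greater similarity.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List, Tuple

Group = FrozenSet[str]


def build_hierarchy(
    base_categories: Iterable[str],
    group_dist: Callable[[Group, Group], float],
) -> List[Tuple[Group, Group, Group]]:
    """Repeatedly merge the two closest groups until a single group remains.

    Returns the merges in order as (left child, right child, combined group).
    Each combined group has exactly two children, and the last combined
    group contains every base category, so it plays the role of the root.
    """
    active: List[Group] = [frozenset([h]) for h in base_categories]  # leaf nodes
    merges: List[Tuple[Group, Group, Group]] = []
    while len(active) > 1:
        # Pick the pair of current groups with the smallest distance.
        left, right = min(
            combinations(active, 2),
            key=lambda pair: group_dist(pair[0], pair[1]),
        )
        combined = left | right
        merges.append((left, right, combined))
        active = [g for g in active if g not in (left, right)] + [combined]
    return merges
```

For n base categories the loop performs n - 1 merges, which matches a hierarchy in which every non-leaf node connects to two sub-categories.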

88. (New) The method of claim 87, further comprising de-emphasizing specific nodes and
focusing on specific non-de-emphasized nodes.